Algorithmic Aspects of Natural Language Processing

Author

Mark-Jan Nederhof

Abstract

Examples of natural languages are Chinese, English and Italian. They are called natural because they evolved in a more or less natural way, without much deliberate design. This sets them apart from formal languages, such as programming languages, which are designed to allow easy processing by computer algorithms. Typically, programs in programming languages such as C or Java can be processed (compiled) in time close to linear in their length. One particular feature that most programming languages have in common, and that allows for their fast processing, is the absence of ambiguity. That is, only one structure, called a parse or parse tree, can be assigned to any program, and this parse can have only one meaning. Furthermore, the design of many programming languages is such that the single parse can be found deterministically, which means that every parsing step contributes a fragment of the resulting parse. As parses have a size linear in the length of the input, this explains why parsing is possible in linear time. Subsequent processing of the parse, for example in order to compile to machine code, is also commonly possible in close to linear time.

Natural languages are quite different in this respect. Like programs in a programming language, sentences in a natural language can be assigned parses, but often the sentences are ambiguous and allow more than one parse. Even for a single parse, there may be ambiguity in the meanings of words or expressions. The existence of ambiguity in natural language is witnessed by frequent misunderstandings in daily life, but it is also an essential feature of poetry and puns.
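The contrast between a single parse and several parses can be made concrete with a small sketch. The code below is not taken from the article: it assumes a hypothetical toy grammar in Chomsky normal form and a hypothetical lexicon, and uses a CYK-style chart to count the parse trees of a sentence. The names `BINARY_RULES`, `LEXICON` and `count_parses` are illustrative only.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (an assumption, not from the article).
# Each rule is (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # prepositional phrase attached to the verb phrase
    ("NP", "NP", "PP"),   # prepositional phrase attached to the noun phrase
    ("NP", "Det", "N"),
    ("PP", "P", "NP"),
]

# Hypothetical lexicon mapping each word to its possible categories.
LEXICON = {
    "I": ["NP"], "saw": ["V"], "the": ["Det"], "a": ["Det"],
    "man": ["N"], "telescope": ["N"], "with": ["P"],
}


def count_parses(words, start="S"):
    """Count the parse trees rooted in `start` that span all of `words`."""
    n = len(words)
    # chart[(i, j)][cat] = number of parse trees of category `cat` over words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for cat in LEXICON.get(word, []):
            chart[(i, i + 1)][cat] += 1
    # Fill the chart for spans of increasing length.
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            for k in range(i + 1, j):  # split point between left and right child
                for parent, left, right in BINARY_RULES:
                    n_left = chart[(i, k)].get(left, 0)
                    n_right = chart[(k, j)].get(right, 0)
                    if n_left and n_right:
                        chart[(i, j)][parent] += n_left * n_right
    return chart[(0, n)].get(start, 0)


if __name__ == "__main__":
    print(count_parses("I saw the man".split()))                   # 1
    print(count_parses("I saw the man with a telescope".split()))  # 2
```

Under this toy grammar, "I saw the man with a telescope" receives two parses, depending on whether "with a telescope" attaches to the verb phrase or to "the man". The chart takes cubic time in the sentence length, in contrast with the linear-time deterministic parsing described above for programming languages.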


Related resources

Composition by Conversation

Most musical programming languages are developed purely for coding virtual instruments or algorithmic compositions. Although there has been some work in the domain of musical query languages for music information retrieval, there has been little attempt to unify the principles of musical programming and query languages with cognitive and natural language processing models that would facilitate ...


Weighted Automata in Text and Speech Processing

Finite-state automata are a very effective tool in natural language processing. However, in a variety of applications and especially in speech processing, it is necessary to consider more general machines in which arcs are assigned weights or costs. We briefly describe some of the main theoretical and algorithmic aspects of these machines. In particular, we describe an efficient composition alg...


Weighted Automata in Text

Processing. Mehryar Mohri, Fernando Pereira and Michael Riley, AT&T Research, 600 Mountain Avenue, Murray Hill, NJ 07974. Abstract. Finite-state automata are a very effective tool in natural language processing. However, in a variety of applications and especially in speech processing, it is necessary to consider more general machines in which arcs are assigned ...


Natural Language Semantics and Computability

This paper is a reflection on the computability of natural language semantics. It does not contain a new model or new results in the formal semantics of natural language: it is rather a computational analysis of the logical models and algorithms currently used in natural language semantics, defined as the mapping of a statement to logical formulas (formulas, because a statement can be ambiguous...


Border Crossings

It is well established by now that computer science has a number of concerns in common with natural language understanding. Common themes show up in particular with algorithmic aspects of text processing. This chapter gives an overview of border crossings from NLP to CS and back. Starting out from syntactic analysis, we trace our route via a philosophical puzzle about meaning, Hoare correctness...



Publication date: 2010
